Abstract. - In this paper, we study two large data sets recording two different
human behaviors: blog-posting and wiki-revising. In both cases, the interevent time distributions
decay as power laws at both the individual and population levels. Unlike previous studies,
we put emphasis on time scales and obtain heterogeneous decay exponents in the intra- and
inter-day ranges for the same data set. Moreover, we observe opposite trends of the exponents in
relation to individual Activity. Further investigation shows that the presence of intra-day
activities masks the correlation between consecutive inter-day activities and leads to an
underestimate of Memory, which explains the contradictory results in recent empirical studies.
Removal of data in the intra-day range reveals high values of Memory and leads to convergent
results between wiki-revising and blog-posting.
Introduction. - Thanks to the development of information
technology, comprehensive data available from the internet
give us valuable insights into the patterns of human behavior.
Many recent studies of human behavior focus on the
distributions of interevent time or waiting time and report
a heavy tail at both the individual and population levels.
Examples of empirical studies include communication patterns
of electronic mail [2HHfT7] and surface mail [HE], web
surfing [6,7], short messages [8], and movie rating [9].

In all the above systems, the observed distribution of
interevent time goes as $\tau^{-\alpha}$ with exponents ranging
from 1 to 3. Various mechanisms have been suggested to explain
the underlying dynamics. One main class of mechanism
is the priority-queue model [2,17], which yields power-law
waiting-time distributions $p(\tau) \sim \tau^{-\alpha}$ with
universal exponents $\alpha = 1.0$ and 1.5. Other mechanisms
include the adaptive interest model [11], the memory model [12]
and the interaction model [15]. A crucial assumption of all these
models and empirical studies is that the mechanisms driving
human behavior are identical at all time scales. According
to this assumption, interevent times with lengths of
minutes and of days are generated by the same mechanism
and follow the same scaling law. Even in the cascading
nonhomogeneous Poisson process [3JH], which emphasizes
external factors such as circadian and weekly cycles, the
distributions still follow power laws with an identical
exponent over the whole range.

Table 1 shows a collection of recent empirical results,
including the exponents, the unit of interevent time
and the time range over which the power laws were observed.
In this table, we simply classify the results into intra- and
inter-day behaviors. As we can see, for data with a unit of
seconds or minutes, the studies often focus on the intra-day
interevent time distribution; for data with a unit of hours
or days, only the inter-day range was studied. None of these
studies investigated both the intra- and inter-day behavior,
though some noticed a hump in the interevent time
distribution caused by the circadian rhythm pH]. One case
that has been studied intensively is email- and letter-based
communication, where some studies suggested that the
mechanisms of the two activities are different, based on the
different exponents observed [2]; others suggested that the
two are essentially the same, based on the data collapse of
interevent time distributions [4]. Limited attention has been
paid to the different time ranges of the two activities: the
timestamps of email and letter communications are in units of
seconds and days respectively, and the exponents are thus
extracted from different time ranges. By comparing the
activities in the table, we find that the inter-day exponents
tend to be clearly higher than the intra-day ones: the
exponents of four of the five inter-day activities are around
or above 2, while all the exponents of the six intra-day
activities are around or a little above 1.

Human activity                     Unit   Range       Exponent
Email [2][12]                      sec    intra-day   1*, 0.9
Correspondence [12][15]            day    inter-day   2.37A, 2.1A
Library loans [2]                  min    intra-day   1*
Printing behavior [16]             sec    intra-day   1.3A
Visits of a web portal [2]         sec    intra-day   1*
Visits to the same URL [6]         sec    intra-day   1
Visits to any page [6]             sec    intra-day   1.25
Queries on AOL [7]                 hour   inter-day   1.9
Message on Ebay [7]                hour   inter-day   1.9
Logging actions on Wikipedia [7]   hour   inter-day   1.2
Movie rating [9]                   day    inter-day   2.08

Table 1: Comparison of the exponents from different human activities. The unit of interevent time and the time range in these
studies are shown in the table. * marks the average of the exponents from individual distributions; A marks the
exponent from a single user; the others are obtained from the global distribution.

It is, of course, insufficient to establish the above
relationship only by comparing the exponents of different
activities. We thus aim to bring further evidence in
this paper. Our work is based on two data sets from
different sources which record two kinds of human activity:
wiki page revising and blog posting [18]. Heavy tails
are found in both the intra- and inter-day parts of the
distributions from these two activities. Our results show that
even for the same activity the exponents of these two ranges
are different.

Further evidence is obtained by examining the dependence
of the decay exponent on individual Activity, a measure
of how frequently the action is taken. Zhou et al. [9]
found that the exponent increases with Activity, which
was further confirmed by Radicchi [7]. Note that
both analyses were conducted in the inter-day range. In our
case, we find the same dependence in the inter-day range
but a remarkably different behavior in the intra-day range.
This further demonstrates that the mechanisms underlying
intra- and inter-day human dynamics are different.

On the other hand, weak memory in human behavior
has been observed in systems such as library loans and
printing [19]. However, other studies show significant memory
in some human-driven systems [16] [18] [20]. For wiki
page revision, we find seemingly weak memory. However,
we observe a strong memory comparable to that of
blog-posting [18] after removing the intra-day intervals and
considering the inter-day ones only. This shows that the memory
of inter-day activities is underestimated because intra-day
activities mask the correlation between inter-day activities in
the analysis. We suggest that this is the reason behind the
apparent weak memory in some human behaviors.



Data sets description. — 

Wikipedia. Wikipedia (Wiki) is a free encyclopedia
written in multiple languages and collaboratively created
by volunteers. Wiki contains millions of articles, which are
produced by tens of thousands of online volunteers. When an
article is revised by a user, a new version is created by this
user. The database we consider contains the timestamps
and authors of all the revisions in the Chinese Wiki.
This data set is composed of 9641842 revisions made by
81823 users between 26/10/2002 and 7/6/2009.
Fig. 1: (Color online) The global distribution of interevent time
spanning the intra- and inter-day range; n is the frequency. We fit
the distributions with the "shifted power law" $y \sim (x + a)^{-\beta}$
[24]. Panels (a) and (c) show the distributions in the intra-day
range for wiki-revising and blog-posting; panels (b) and (d)
show the inter-day range. The decay exponents are
$\beta_{mins} \approx 1.88$ and $\beta_{hours} \approx 1.32$ in (a),
$\beta_{mins} \approx 1.20$ and $\beta_{hours} \approx 0.66$ in (c);
$\beta \approx 1.57$ in (b), $\beta \approx 2.02$ in (d).

Blog. Blog is one of the so-called Web 2.0 applications
that have emerged in recent years, on which people post, read
and comment on each other's articles [22][23]. Our data
were collected from the campus blog website of Nanjing
University (http://bbs.nju.edu.cn/blogall). Most users are
current or former students and teachers of Nanjing
University. As of 01/09/2009, there were 1627697 articles
posted by 20379 users on this website. The first post was
made on 25/03/2003, when the blog was established.

Empirical analysis. — 

The global distribution of interevent time in the intra-day
and inter-day ranges. The timestamps of both data
sets have a precision of one minute. Here, the interevent time
$\tau$ is the time interval between consecutive actions by the
same user, i.e. revising a wiki page in wiki or posting an
article in blog. The global distributions of $\tau$
for both data sets are shown in figure 1. As we can see,
the distributions can be divided into two parts: for the
intra-day range, the curves clearly show heavy tails;
for the inter-day range, they all show oscillations caused
by circadian periodicity, which make it hard to observe
the scaling law.
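
As an illustration of the definition above, the following minimal Python sketch computes per-user interevent times from (user, timestamp) records and pools them into a global frequency table. The record layout, the function names and the 1440-minute boundary between the two ranges are assumptions made for this example; they are not taken from the data sets.

```python
from collections import Counter, defaultdict

MINUTES_PER_DAY = 24 * 60  # assumed boundary between the intra- and inter-day ranges


def interevent_times(records):
    """Map each user to the gaps (in minutes) between that user's consecutive actions.

    `records` is an iterable of (user_id, timestamp_in_minutes) pairs in any order.
    """
    timestamps = defaultdict(list)
    for user, minute in records:
        timestamps[user].append(minute)
    gaps = {}
    for user, minutes in timestamps.items():
        minutes.sort()
        gaps[user] = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
    return gaps


def global_distribution(gaps):
    """Pool all users' gaps and count how often each value of tau occurs."""
    return Counter(tau for user_gaps in gaps.values() for tau in user_gaps)


def split_ranges(freq):
    """Split a frequency table into intra-day (< 1 day) and inter-day parts."""
    intra = {tau: n for tau, n in freq.items() if tau < MINUTES_PER_DAY}
    inter = {tau: n for tau, n in freq.items() if tau >= MINUTES_PER_DAY}
    return intra, inter


# Toy records: user "a" acts at minutes 0, 5, 10 and 3000; user "b" at 100 and 105.
records = [("a", 0), ("b", 100), ("a", 5), ("a", 3000), ("b", 105), ("a", 10)]
gaps = interevent_times(records)
freq = global_distribution(gaps)
intra, inter = split_ranges(freq)
print(gaps["a"], intra, inter)
```

Gaps are taken only between actions of the same user, so pooling them never mixes the timelines of different users.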

Even in the intra-day range, the power-law behavior
is not homogeneous across all time scales and a slight hump
is observed at $\tau \approx 100$ (see fig. 1(a) and (c), and
fig. 4 for clearer evidence). We thus apply a piecewise fitting
curve to show the change in power-law exponents. For the
range $\tau < 100$ (within about 1 hour), the exponents
of blogging and wiki-revising activities are 1.20 and 1.88;
for the range $\tau > 100$ (beyond 1 hour and within 1
day), lower values of 0.66 and 1.32 are found.
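
The idea of a piecewise fit can be sketched as follows. This example uses an ordinary least-squares line in log-log space on each side of an assumed break point at 100 minutes; it is not the shifted power-law fit of [24], and more robust estimators exist for real data. The synthetic frequencies are built from the two wiki exponents quoted above purely so that the result can be checked.

```python
import math


def loglog_exponent(freq, lo, hi):
    """Estimate the decay exponent on lo <= tau < hi by least squares in log-log space.

    `freq` maps interevent time tau (> 0) to its frequency n. Returns beta such
    that n ~ tau**(-beta) on the chosen range.
    """
    points = [
        (math.log(tau), math.log(n))
        for tau, n in freq.items()
        if lo <= tau < hi and tau > 0 and n > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two points in the fitting range")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx  # the slope is negative for a decaying power law


# Synthetic frequencies with a kink at tau = 100 (hypothetical data, not the real sets).
BREAK = 100
freq = {}
for tau in range(1, 1440):
    if tau < BREAK:
        freq[tau] = 1e6 * tau ** -1.88
    else:
        # chosen so that the curve is continuous at the break point
        freq[tau] = 1e6 * BREAK ** -1.88 * (tau / BREAK) ** -1.32

beta_short = loglog_exponent(freq, 1, BREAK)
beta_long = loglog_exponent(freq, BREAK, 1440)
print(round(beta_short, 2), round(beta_long, 2))
```

On noisy empirical counts the two estimates depend on where the break is placed, so the fitting ranges should be reported together with the exponents.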

Figure 1(b) and (d) show the distributions of inter-day
interevent time, where a unit of one day is employed to
eliminate the oscillation. The heavy tails in the inter-day
range are shown clearly in these two distributions. The
exponent of blogging activity is 2.02, which is significantly
higher than those in the intra-day range, in agreement with
the results obtained by comparing different empirical
studies in table 1. On the other hand, the intra-day
exponent of wiki-revising seems to be close to the inter-day one.
However, as we will see in the following section, the empirical
analysis at the group and individual levels demonstrates the
different activity patterns of the two ranges.

Heterogeneous Dependence on Activity. In this section,
we investigate further the features of the intra- and
inter-day activity patterns. Firstly, we measure the average
Activity $A_i$ of user $i$ as $A_i = n_i/d_i$, where $n_i$ is the
total number of actions of user $i$ and $d_i$ is the time between
the first and the last actions. We then sort users in ascending
order of Activity and divide the entire population into
10 groups, each of which has $M$ users ($M \approx N/10$, where
$N$ is the total number of users). The first $M$ users in the
list belong to group 1, and so on, up to the last $M$ users,
who belong to group 10. We only consider users with
$n_i, d_i > 10$. For wiki, there are 14410 qualified users and
$M = 1400$; for blog, there are 12827 qualified users and
$M = 1300$. Unlike previous studies [7][9], which focus only
on the inter-day range, we investigate the dependence of
the exponent on Activity in both the intra- and inter-day
ranges. In fig. 2, we plot the interevent time distribution
of wiki for groups 3 and 9 (which correspond to average
Activity $\langle A \rangle = 0.07$ and 1.12, respectively). For
the inter-day range, we get the same dependence as the one
obtained in other inter-day activities: the exponents increase
with Activity. Some exponents of inter-day activities are small,
such as the one for logging actions, probably due to the
relatively low activity [7]. For the intra-day range, this
dependence is totally different: the exponents decrease with
increasing Activity and the change is relatively smooth.
In fig. 3, we plot the exponent of the interevent time
distribution of wiki-revising and blog-posting as a function
of Activity. Though the values of the exponents are different
in these two cases, they show the same features: the
exponent and Activity are positively related in the inter-day
part and negatively related in the intra-day one.

Fig. 2: (Color online) The interevent time distribution of
wiki-revising at a group level (group 3 and group 9). For the
distributions in the intra-day range, the range of fitting is from
1 to 70. (a) and (c) correspond to the intra-day range and (b) and
(d) correspond to the inter-day range. The decay exponents are
$\beta \approx 2.00$ in (a), $\beta \approx 1.16$ in (b),
$\beta \approx 1.75$ in (c), $\beta \approx 2.21$ in (d).
Fig. 3: (Color online) Dependence of decay exponents on
Activity.
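
The grouping by Activity can be sketched in a few lines of Python. Several choices here are assumptions for illustration only: d_i is taken to be in days, the qualification thresholds are exposed as parameters, the group size is the floor of N divided by the number of groups, and users left over at the top of the ranking are simply not assigned. None of these details is fixed by the text.

```python
def activity_groups(users, n_groups=10, min_actions=10, min_span=10):
    """Rank qualified users by Activity A_i = n_i / d_i and cut the ranking into groups.

    `users` maps user_id -> (n_i, d_i): the number of actions and the time between
    the first and last action (days are assumed). Returns (groups, activity).
    """
    activity = {
        user: n / d
        for user, (n, d) in users.items()
        if n > min_actions and d > min_span
    }
    ranked = sorted(activity, key=activity.get)  # ascending Activity
    size = len(ranked) // n_groups               # group size M (assumed floor division)
    groups = [ranked[g * size:(g + 1) * size] for g in range(n_groups)]
    return groups, activity


def mean_activity(group, activity):
    """Average Activity <A> of the users in one group."""
    return sum(activity[user] for user in group) / len(group)


# Toy population: 20 qualified users plus one with too few actions.
users = {f"u{i:02d}": (20 + 10 * i, 100) for i in range(20)}
users["sparse"] = (3, 100)
groups, activity = activity_groups(users, n_groups=5)
print([round(mean_activity(group, activity), 2) for group in groups])
```

The exponent of each group would then be estimated separately on the intra-day and inter-day parts of that group's pooled interevent times.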

Interevent time distribution for individuals. To provide
further evidence for our conjecture, we look in detail at the
behavior of individual agents. Figure 4 shows the cumulative
distribution of interevent time for four users, two from
the wiki data set and two from the blog data
set. An obvious change of trend is observed at $\tau \approx 1$
day. For the inter-day range, all these distributions follow
power laws. Wiki users often revise one page many times
within a day, but blog users seldom post several articles
in one day. Therefore, it is hard to study the intra-day
activity of blog-posting at the individual level, as the data
are insufficient in this range. For wiki-revising, the
distributions are heterogeneous even within the intra-day range
(see fig. 4(a)), which is consistent with the global one and
shows further complexity in the mechanism of human activity.
Fig. 4: (Color online) The cumulative distribution of interevent
times of individuals. N is the cumulative frequency of intervals.
User 1 and User 2 in (a) and (b) are from wiki; User 3 and
User 4 in (c) and (d) are from blog. The decay exponents
are $\beta_{mins} \approx 0.38$, $\beta_{hours} \approx 0.11$ and
$\beta_{days} \approx 1.23$ in (a); $\beta_{hours} \approx 0.19$ and
$\beta_{days} \approx 1.57$ in (b); $\beta \approx 1.22$ in (c);
$\beta \approx 1.13$ in (d).

The consecutive interevent times of these users are plotted
in fig. 5, which helps us to visualize the dynamics of their
activities. For the blog user (see fig. 5(a)), we observe
clustering of extremely long interevent times, also called the
mountain-valley structure, which is found in many complex
systems [25][26]. For the wiki user, fig. 5(b) shows similar
clustering, but the interevent times longer than one day are
separated by many short intra-day interevent times, which
are rare in blog-posting (compare with fig. 5(a)). The
consequence is that the values of Memory become rather
small. The definition of Memory is as follows [19]:

$$M_k = \frac{1}{n_\tau - k} \sum_{i=1}^{n_\tau - k} \frac{(\tau_i - m_1)(\tau_{i+k} - m_2)}{\sigma_1 \sigma_2} \qquad (1)$$

where $\tau_i$ are the interevent time values, $n_\tau$ is the
number of interevent times, and $m_1$ ($m_2$) and $\sigma_1$
($\sigma_2$) are the sample mean and sample standard deviation of
the $\tau_i$'s ($\tau_{i+k}$'s). The two interevent times $\tau_i$
and $\tau_{i+k}$ are separated by $k$ events.
Fig. 5: The interevent times of consecutive events. (a) User 3
in figure 4(c). (b) User 1 in figure 4(a). (c) User 1 after deleting
the short interevent times, i.e. those of less than 1000 mins.



The Memory $M_1$ of the blog user is 0.13, but that of
the wiki user is only 0.02.
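
A minimal Python sketch of the Memory at lag k is given below. It assumes that Memory is the Pearson correlation between the series of $\tau_i$ and the series of $\tau_{i+k}$, and it uses population standard deviations (division by the number of pairs) so that the result lies between -1 and 1; both the function name and that normalization are assumptions of this example.

```python
import math


def memory(taus, k=1):
    """Memory coefficient M_k: Pearson correlation between tau_i and tau_{i+k}."""
    if k < 1 or len(taus) - k < 2:
        raise ValueError("series too short for this lag")
    first, second = taus[:-k], taus[k:]
    n = len(first)  # number of pairs, n_tau - k
    m1, m2 = sum(first) / n, sum(second) / n
    s1 = math.sqrt(sum((x - m1) ** 2 for x in first) / n)
    s2 = math.sqrt(sum((y - m2) ** 2 for y in second) / n)
    if s1 == 0 or s2 == 0:
        raise ValueError("constant series: correlation undefined")
    cov = sum((x - m1) * (y - m2) for x, y in zip(first, second)) / n
    return cov / (s1 * s2)


# A strictly alternating series: short gaps are always followed by long ones.
alternating = [1, 30, 1, 30, 1, 30, 1, 30]
m_lag1 = memory(alternating, k=1)
m_lag2 = memory(alternating, k=2)
print(round(m_lag1, 3), round(m_lag2, 3))
```

A positive value means that long (short) interevent times tend to follow long (short) ones, which is the clustering visible in fig. 5; a negative value means that long and short intervals tend to alternate.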

The average $M_k$ of all qualified users with $k$ ranging
from 1 to 35 is shown in fig. 6. The average $M_1$ of
wiki-revising is 0.13, which is clearly less than 0.21, the $M_1$
of blog-posting. This result is in agreement with what we found
in fig. 5(a) and (b). As there are different mechanisms in
human activity in the intra- and inter-day ranges, we seek a
way to study the memory of these mechanisms separately.
We remove the interevent times of wiki-revising that are
less than 1000 minutes (about 1 day) and analyze the
remaining series, which contain only the inter-day intervals.
This allows us to consider only the memory in the inter-day
intervals and ignore the actions within one day. Figure
5(c) shows the interevent time series after data removal, for
which $M_1$ is 0.12. Correspondingly, we also find a significant
increase in the average $M_1$ of wiki-revising through
this procedure. As shown in the inset of fig. 6, the average
$M_1$ increases to 0.20, which is very close to the one in
blog-posting. Moreover, the decay curve is similar to that of
blog-posting: when $k < 10$, it decays asymptotically as a
power law; when $k > 10$, it decays exponentially.
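
The masking effect can be reproduced on a small synthetic series. In the sketch below, long gaps arrive in clusters of similar size, but each one is followed by a burst of short gaps; the series, the helper names and the exaggerated contrast are all hypothetical and serve only to show the direction of the effect, not its magnitude in the real data.

```python
import math

THRESHOLD = 1000  # minutes; the cut-off used in the text


def lag1_memory(taus):
    """M_1: Pearson correlation between consecutive interevent times."""
    x, y = taus[:-1], taus[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def inter_day_only(taus, threshold=THRESHOLD):
    """Drop interevent times shorter than `threshold` minutes, keeping the order."""
    return [tau for tau in taus if tau >= threshold]


# Clustered long gaps, each followed by three short intra-day gaps.
long_gaps = [2000, 2200, 2400, 9000, 9400, 9800] * 3
series = []
for gap in long_gaps:
    series.extend([gap, 5, 40, 12])

raw = lag1_memory(series)
filtered = lag1_memory(inter_day_only(series))
print(round(raw, 2), round(filtered, 2))
```

In the raw series almost every long gap is paired with a short neighbour, which hides the positive correlation among the long gaps; after filtering, consecutive long gaps are compared directly and that correlation reappears.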

Discussion. - We conclude by remarking on two concrete
pieces of evidence which support our conjecture that human
activity patterns are significantly different at different time
scales. Firstly, the exponents of the interevent time
distribution are different in the intra- and inter-day ranges. In
addition to the comparison with previous empirical studies,
we show the difference at the individual and global levels
by investigating the activity patterns of wiki-revising and
blog-posting. The second piece of evidence is the different
dependence on Activity: for the inter-day range, the exponents
increase with Activity; for the intra-day range, the exponents
decrease with Activity, and with smaller magnitude.
On the other hand, we show the behavioral similarity between
wiki-revising and blog-posting, as the same exponent
dependence is observed in the corresponding ranges. This
similarity further increases after removal of the intra-day
interevent times of wiki-revising. A previous study reported
the lack of memory in human activity, but our work shows
that the presence of intra-day activities masks the correlation
between consecutive inter-day activities and leads to
an underestimate of memory. Can we thus classify human
activities by the interevent time scale? How can we accurately
measure the memory in a series which is complex
and heavy-tailed? Further investigations are required in
these directions.

Fig. 6: (Color online) The average $M_k$ of all qualified users in
blog-posting and wiki-revising after data removal, for different
$k$. The comparison between the results before and after data
removal is shown in the inset. For blog-posting,
$M_k$ decays as a power law when $k < 10$: $M_k = 0.23\,k^{-0.45}$;
when $k > 10$, it decays exponentially: $M_k = 0.1\,e^{-k/23.22}$
[18]. For the original wiki data, it decays as a power law over the
whole range: $M_k = 0.13\,k^{-0.47}$. After data removal, when $k < 9$:
M k = 0.61 * AT ' 21 ; when $k > 9$: $M_k = 0.10\,e^{-k/12.76}$. To
avoid characterizing users whose number of actions is too small,
we consider only the qualified users of the two data sets and
calculate the memory of all these users with $k$ ranging from 1 to
35 (for wiki, a total of 809 users with more than 800 revisions
and a frequency of long intervals (> 1000 mins) of more
than 100 are considered; for blog, a total of 2126 users with
more than 200 posts and a frequency of long intervals (> 1000
mins) of more than 200 are considered).

In our previous study [18], the personal-preference
model was suggested to describe blog-posting; it successfully
generates the exponent dependence on Activity
and the significant memory. Here, our analysis further
shows that the model is suitable for wiki-revising in the
inter-day range, as it shows the same exponent dependence.
However, there is still no model which can explain the
negative relationship between the exponents and Activity
in the intra-day range. One possible explanation is the time
scale of scheduling activities. We can plan our daily schedule
carefully according to our personal preference, but we
hardly plan what to do every minute. Our actions at the scale
of minutes are more stochastic, which may lead to the smaller
burstiness in the intra-day range (the exponents in this
range are often smaller). Though a random walk in one
dimension [27, 28] can be used to explain interevent times in a
stochastic process, the value of the exponent obtained is fixed
at 1.5, which does not agree with the present empirical
results.

We finally remark on the interesting details within
both the intra- and inter-day ranges. A slight
hump is observed in $P(\tau)$ at $\tau \approx 1$ hour. For the
inter-day range, the decay of memory is a power law when
$k < 10$ and becomes exponential beyond this range. Is there a
relationship between time units (such as minute, hour, week,
month) and the dynamics underlying human activities?
For example, the trend change observed in $P(\tau)$ at one hour
may be due to the timing of tasks in hours.